Improved regret for zeroth-order adversarial bandit convex optimisation

نویسندگان

چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback

The study of online convex optimization in the bandit setting was initiated by Kleinberg (2004) and Flaxman et al. (2005). Such a setting models a decision maker that has to make decisions in the face of adversarially chosen convex loss functions. Moreover, the only information the decision maker receives are the losses. The identities of the loss functions themselves are not revealed. In this ...

متن کامل

Bandit Convex Optimization: √ T Regret in One Dimension

We analyze the minimax regret of the adversarial bandit convex optimization problem. Focusing on the one-dimensional case, we prove that the minimax regret is Θ̃( √ T ) and partially resolve a decade-old open problem. Our analysis is non-constructive, as we do not present a concrete algorithm that attains this regret rate. Instead, we use minimax duality to reduce the problem to a Bayesian setti...

متن کامل

Bandit Convex Optimization: \(\sqrt{T}\) Regret in One Dimension

We analyze the minimax regret of the adversarial bandit convex optimization problem. Focusing on the one-dimensional case, we prove that the minimax regret is Θ̃( √ T ) and partially resolve a decade-old open problem. Our analysis is non-constructive, as we do not present a concrete algorithm that attains this regret rate. Instead, we use minimax duality to reduce the problem to a Bayesian setti...

متن کامل

Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits

We give an oracle-based algorithm for the adversarial contextual bandit problem, where either contexts are drawn i.i.d. or the sequence of contexts is known a priori, but where the losses are picked adversarially. Our algorithm is computationally efficient, assuming access to an offline optimization oracle, and enjoys a regret of order O((KT ) 2 3 (logN) 1 3 ), where K is the number of actions,...

متن کامل

On Zeroth-Order Stochastic Convex Optimization via Random Walks

We propose a method for zeroth order stochastic convex optimization that attains the suboptimality rate of Õ(n7T−1/2) after T queries for a convex bounded function f : R → R. The method is based on a random walk (the Ball Walk) on the epigraph of the function. The randomized approach circumvents the problem of gradient estimation, and appears to be less sensitive to noisy function evaluations c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Mathematical Statistics and Learning

سال: 2020

ISSN: 2520-2316

DOI: 10.4171/msl/17